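Welcome to Lesson 7, where we introduce transfer learning. This technique reuses a deep learning model that has already been trained on a large, general dataset (such as ImageNet) and adapts it to a new, specific task (such as our FoodVision challenge). When labeled data is limited, it is essential for reaching state-of-the-art performance efficiently.
1. The Power of Pre-trained Weights
Deep neural networks learn features hierarchically. Lower layers learn basic concepts (edges, corners, textures), while deeper layers combine these into complex concepts (eyes, wheels, specific objects). The key insight is that the basic features learned in the early layers are broadly applicable across most visual domains.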
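Components of Transfer Learning
- Source task: training on a large, general dataset (for example, ImageNet, which holds about 14 million images in total; most pre-trained models use its 1,000-class subset of roughly 1.2 million training images).
- Target task: adapting the weights to classify a much smaller dataset (for example, our specific FoodVision classes).
- Reused component: the vast majority of the network's parameters, namely the feature extraction layers, are reused directly.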
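The split between reused and newly trained parameters can be sketched in PyTorch. This is a minimal illustration, not the lesson's actual model: `base` is a small hypothetical stand-in for a pre-trained feature extractor (randomly initialized here so the sketch runs offline), and the 7-class head is an assumed target task.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pre-trained feature extractor. A real workflow
# would load pre-trained weights instead of using random initialization.
base = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),  # produces a feature vector of size 32 per image
)

# Freeze the reused part so its weights receive no gradient updates.
for param in base.parameters():
    param.requires_grad = False

# New, trainable head for the target task (7 classes is an assumption).
num_classes = 7
head = nn.Linear(32, num_classes)
model = nn.Sequential(base, head)

# Count how many parameters are reused as-is versus trained from scratch.
frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"frozen: {frozen}, trainable: {trainable}")
```

Even in this toy model the frozen feature layers hold most of the parameters; in a full-size pre-trained network the imbalance is far larger.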
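Efficiency Gains
Transfer learning substantially lowers two resource barriers: computational cost (you do not need to spend days training an entire model) and data requirements (high accuracy is often achievable with hundreds rather than thousands of training samples).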
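One way to see where the compute savings come from is a single training step in which only the new head is optimized. The sketch below rests on stated assumptions: `base` is a hypothetical stand-in for frozen pre-trained layers, the inputs and labels are synthetic, and the sizes (20 input features, 16 hidden features, 7 classes) are chosen only for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for frozen pre-trained layers.
base = nn.Sequential(nn.Linear(20, 16), nn.ReLU())
for param in base.parameters():
    param.requires_grad = False

head = nn.Linear(16, 7)  # new trainable classifier (7 classes assumed)

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Synthetic batch, used purely for illustration.
inputs = torch.randn(8, 20)
labels = torch.randint(0, 7, (8,))

# Snapshots to compare against after the update.
base_before = [p.detach().clone() for p in base.parameters()]
head_before = head.weight.detach().clone()

# No gradient graph is built for the frozen base, which saves memory and time.
base.eval()
with torch.no_grad():
    features = base(inputs)

logits = head(features)
loss = loss_fn(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

base_unchanged = all(
    torch.equal(before, after)
    for before, after in zip(base_before, base.parameters())
)
head_changed = not torch.equal(head_before, head.weight)
print(f"base unchanged: {base_unchanged}, head updated: {head_changed}")
```

Because the backward pass stops at the head, each step costs a fraction of what full training would, which is the saving described above.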
Question 1
What is the primary advantage of using a model pre-trained on ImageNet for a new vision task?
Question 2
In a Transfer Learning workflow, which part of the neural network is typically frozen?
Question 3
When replacing the classifier head in PyTorch, what parameter must you first determine from the frozen base?
Challenge: Adapting the Classifier Head
Designing a new classifier for FoodVision.
You load a ResNet model pre-trained on ImageNet. Its last feature layer outputs a vector of size 512. Your 'FoodVision' project has 7 distinct food classes.
Step 1
What is the required Input Feature size for the new, trainable Linear Layer?
Solution:
The Input Feature size must match the output of the frozen base layer.
Size: 512.
Step 2
What is the PyTorch code snippet to create this new classification layer (assuming the output is named `new_layer`)?
Solution:
The frozen base's output size (512) is the new layer's input size, and the class count (7) is its output size.
Code:
new_layer = torch.nn.Linear(512, 7)
Step 3
What is the required Output Feature size for the new Linear Layer?
Solution:
The Output Feature size must match the number of target classes.
Size: 7.